MONTE CARLO SIMULATION

1. Overview

Monte Carlo methods are computational techniques that use random sampling to estimate numerical results for problems that are otherwise hard to solve exactly. They are especially useful when dealing with systems that involve uncertainty or complexity that makes traditional calculations difficult. The method was applied in the Manhattan Project to simulate neutron diffusion and nuclear reactions. The term “Monte Carlo” originates from the Monte Carlo Casino in Monaco, highlighting the reliance on chance and randomness.


2. Mathematical Foundation

Core Concept

Monte Carlo methods rely on random sampling to obtain numerical results for deterministic problems:

\[ \text{Estimate} = \frac{1}{N} \sum_{i=1}^{N} f(X_i) \]

Where:

  • \(N\) = number of samples
  • \(X_i\) = random samples drawn from the input distribution
  • \(f(X_i)\) = function of interest evaluated at each sample

Fundamental Theorems

Law of Large Numbers: As \(N \to \infty\), the sample mean converges to the expected value:

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{i=1}^{N} f(X_i) = \mathbb{E}[f(X)] \]

Central Limit Theorem: The sampling distribution of the estimator approaches a normal distribution:

\[ \sqrt{N}(\bar{X}_N - \mu) \xrightarrow{d} N(0, \sigma^2) \]

In practice, a Monte Carlo estimate of a deterministic quantity is built in four steps:

  1. Model the problem using probability theory.
  2. Generate random inputs from the appropriate distributions.
  3. Apply the function or model to each sample.
  4. Aggregate outcomes to estimate the desired metric.
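
As a minimal sketch of these four steps (with an arbitrary sample size chosen for illustration), the integral \(\int_0^1 \frac{4}{1+x^2}\,dx = \pi\) can be estimated by sampling uniform inputs:

```r
# Step 1: model the problem -- the integral of 4 / (1 + x^2) over [0, 1]
# equals the expected value of that function for X ~ Uniform(0, 1).
set.seed(42)

# Step 2: generate random inputs from the appropriate distribution
n <- 1e6
x <- runif(n)

# Step 3: apply the function of interest to each sample
fx <- 4 / (1 + x^2)

# Step 4: aggregate outcomes -- the sample mean estimates the integral
estimate <- mean(fx)
estimate   # close to pi
```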

Example formula for π approximation:

\[ \pi \approx 4 \times \frac{\text{Number of points inside the circle}}{\text{Total number of points}} \]

π Approximation

Monte Carlo can estimate π by randomly sampling points in a unit square and counting how many fall inside a quarter circle.

Conceptual Steps:

  • Generate N random points in [0,1] × [0,1].
  • Count points where \(x^2 + y^2 \le 1\).
  • Ratio of points inside the circle to total points approximates π/4.
  • Multiply by 4 to get π.

# Define different sample sizes to test Monte Carlo estimation accuracy
Samples <- c(1000, 10000, 100000, 1000000)

# Assign variable names for readability
s_1 <- Samples[1]       # 1,000 samples
s_2 <- Samples[2]       # 10,000 samples
s_3 <- Samples[3]       # 100,000 samples
s_4 <- Samples[4]       # 1,000,000 samples

# Set random seed for reproducibility (ensures same random numbers each run)
set.seed(123)

# Choose the number of random points to generate
n_points <- s_1

# Generate n_points random x and y coordinates uniformly between -1 and 1
# (sampling the full square [-1, 1] x [-1, 1] against the full unit circle
# gives the same pi/4 ratio as the quarter-circle setup described above)
x <- runif(n_points, -1, 1)
y <- runif(n_points, -1, 1)

# Check whether each (x, y) point lies inside the circle (x² + y² ≤ 1)
inside <- ((x^2 + y^2) <= 1)

# Create a data frame for visualization or analysis
df <- data.frame(x, y, inside)

# Estimate the value of π using the Monte Carlo method:
# ratio of points inside the circle to total points × 4 (area ratio)
pi_est <- 4 * sum(inside) / n_points
pi_est

Key points:

  • Accuracy increases with the number of samples.
  • Convergence is guaranteed by the law of large numbers.
  • Monte Carlo can handle multi-dimensional problems where analytical solutions are difficult.
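
The first two points can be seen directly by repeating the π estimate at increasing sample sizes (the sizes below are illustrative); the absolute error shrinks roughly as \(1/\sqrt{N}\):

```r
# Repeat the pi estimate at increasing sample sizes to watch convergence
set.seed(123)

estimate_pi <- function(n) {
  x <- runif(n, -1, 1)
  y <- runif(n, -1, 1)
  4 * mean(x^2 + y^2 <= 1)   # fraction inside the circle, scaled by 4
}

sizes  <- c(1e2, 1e3, 1e4, 1e5, 1e6)
errors <- sapply(sizes, function(n) abs(estimate_pi(n) - pi))

# Tabulate sample size against absolute error
data.frame(N = sizes, abs_error = errors)
```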

3. Project: Financial Option Pricing (Geometric Brownian Motion)

# MONTE CARLO SIMULATION FUNCTION
montecarlo_option <- function(S0, K, r, sigma, T, n_paths, n_steps) {
  
  dt <- T / n_steps  # Time step size
  
  # Initialize matrix to store stock price paths
  paths <- matrix(0, nrow = n_paths, ncol = n_steps + 1)
  paths[, 1] <- S0
  
  # Generate standard normal random numbers
  z_matrix <- matrix(rnorm(n_paths * n_steps), nrow = n_paths, ncol = n_steps)
  
  # Simulate stock paths using Geometric Brownian Motion
  for (j in 2:(n_steps + 1)) {
    paths[, j] <- paths[, j - 1] * exp(
      (r - 0.5 * sigma^2) * dt + sigma * sqrt(dt) * z_matrix[, j - 1]
    )
  }
  
  # Calculate option payoffs
  call_payoff <- pmax(paths[, n_steps + 1] - K, 0)
  put_payoff  <- pmax(K - paths[, n_steps + 1], 0)
  
  call_price <- exp(-r * T) * mean(call_payoff)
  put_price  <- exp(-r * T) * mean(put_payoff)
  
  return(list(
    call_price = call_price,
    put_price = put_price,
    paths = paths
  ))
}


# SIMULATION PARAMETERS
S0 <- 100       # Initial stock price
K <- 105        # Strike price
r <- 0.05       # Risk-free interest rate (annual)
sigma <- 0.2    # Volatility (annual)
T <- 1          # Time to maturity in years (note: T also masks R's shorthand for TRUE)
n_paths <- 1000 # Number of Monte Carlo paths
n_steps <- 252  # Number of time steps

set.seed(123)   # For reproducibility

# Run the simulation
option_result <- montecarlo_option(S0, K, r, sigma, T, n_paths, n_steps)
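
As a sanity check on the simulated prices, the closed-form Black–Scholes formulas (a standard result, not part of the simulation above) can be evaluated with the same parameters; the Monte Carlo estimates should land near these values, up to sampling noise:

```r
# Closed-form Black-Scholes prices for comparison with the Monte Carlo result
S0 <- 100; K <- 105; r <- 0.05; sigma <- 0.2; T <- 1

d1 <- (log(S0 / K) + (r + 0.5 * sigma^2) * T) / (sigma * sqrt(T))
d2 <- d1 - sigma * sqrt(T)

bs_call <- S0 * pnorm(d1) - K * exp(-r * T) * pnorm(d2)
bs_put  <- K * exp(-r * T) * pnorm(-d2) - S0 * pnorm(-d1)

c(call = bs_call, put = bs_put)   # approximately 8.02 and 7.90
```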

Monte Carlo Option Results

  Parameter           Value
  -----------------   --------
  Call Option Price   8.591995
  Put Option Price    7.464224

4. Uses of Monte Carlo Methods

Monte Carlo simulations are applied wherever complex systems involve uncertainty or stochastic behavior. Key use cases include:

  • Physics: particle interactions, radiation transport, and nuclear simulations.
  • Finance: option pricing, risk management, and portfolio optimization.
  • Engineering: reliability analysis, failure probability estimation, and system design.
  • Project Management: risk analysis, sensitivity testing, and forecasting.
  • Statistics/Data Science: probabilistic modeling, integrals, and sampling-based inference.

BENFORD’S LAW ANALYSIS

1. Overview

Benford’s Law, also known as the First-Digit Law, describes the frequency distribution of leading digits in many naturally occurring datasets. Contrary to intuition, lower digits occur more frequently as the first digit. Specifically, the digit 1 appears about 30% of the time, while higher digits (e.g., 9) appear less than 5% of the time. Applications of Benford’s Law include fraud detection in accounting, forensic analysis, and data validation, as deviations from the expected distribution may indicate manipulation or anomalies. The law is named after physicist Frank Benford, who formalized it in 1938, though it was first observed by Simon Newcomb in 1881.


2. Mathematical Foundation

Core Concept

Benford’s Law predicts the probability of each digit \(d\) (1–9) as the first significant digit:

\[ P(D = d) = \log_{10}\left(1 + \frac{1}{d}\right), \quad d \in \{1,2,\dots,9\} \]

Where:

  • \(D\) = first significant digit
  • \(P(D=d)\) = probability of digit \(d\) occurring as the first digit

Benford's Law Probabilities

  Digit   Probability
  -----   -----------
    1       0.301
    2       0.176
    3       0.125
    4       0.097
    5       0.079
    6       0.067
    7       0.058
    8       0.051
    9       0.046
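
The table above can be reproduced directly from the formula:

```r
# Benford first-digit probabilities: P(D = d) = log10(1 + 1/d)
d <- 1:9
benford_probs <- log10(1 + 1 / d)
round(benford_probs, 3)
# 0.301 0.176 0.125 0.097 0.079 0.067 0.058 0.051 0.046
```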

Fundamental Theorems / Statistical Test

To check whether a dataset follows Benford’s Law, a Chi-square goodness-of-fit test is often used:

\[ \chi^2 = \sum_{d=1}^{9} \frac{(O_d - E_d)^2}{E_d} \]

Where:

  • \(O_d\) = observed frequency of digit \(d\) in the dataset
  • \(E_d = P(D=d) \cdot N\) = expected frequency under Benford's Law
  • \(N\) = total number of observations

Interpretation:
- If \(\chi^2\) is below the critical value for 8 degrees of freedom (9 digits − 1), the dataset is consistent with Benford's Law.
- A significantly large \(\chi^2\) indicates deviation, potentially signaling anomalies or fraud.

Steps to Apply Benford Analysis in Practice

  1. Extract the leading digit from each number in the dataset.
  2. Compute the observed frequency for each digit (1–9).
  3. Compute expected frequencies based on Benford’s Law.
  4. Apply the Chi-square test to compare observed and expected frequencies.
  5. Interpret the results: conformity suggests normal patterns; deviation may indicate irregularities.
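
The steps above can be sketched with base R alone. The data here are synthetic (values of the form \(10^U\) with \(U\) uniform have exactly Benford-distributed first digits), so a large p-value is expected:

```r
# Sketch of the chi-square conformity check using base R only
set.seed(123)
values <- 10^runif(5000, 0, 5)   # synthetic, exactly Benford-distributed

# Step 1: extract the leading digit of each value
first_digit <- floor(values / 10^floor(log10(values)))

# Step 2: observed frequency of each digit 1-9
observed <- tabulate(first_digit, nbins = 9)

# Step 3: expected probabilities under Benford's Law
benford_probs <- log10(1 + 1 / (1:9))

# Steps 4-5: chi-square goodness-of-fit test; a large p-value means
# no evidence of deviation from Benford's Law
test <- chisq.test(observed, p = benford_probs)
test$p.value
```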

3. Project: Forensic Analysis of Payroll Data with Benford’s Law

# Load the benford.analysis package, which provides the benford() function
library(benford.analysis)

# Read the payroll data into a data frame
payroll_df <- read.csv("/home/vinayak/payroll_data.csv", stringsAsFactors = FALSE)

# Check first few rows of the data
head(payroll_df)
##   EmployeeID   Salary
## 1      E0001 12799.80
## 2      E0002 36267.33
## 3      E0003 25374.23
## 4      E0004 10371.87
## 5      E0005 16493.14
## 6      E0006 50297.88

# Create a Benford object for the first digit of the Salary column
bf_salary <- benford(payroll_df$Salary, number.of.digits = 1)

# Plot the observed vs. expected first-digit distribution
plot(bf_salary)

Benford Analysis Results

  Digit   Actual Count   Actual (%)   Benford (%)   Difference (%)
  -----   ------------   ----------   -----------   --------------
    1          75           37.5         30.10            7.40
    2          57           28.5         17.61           10.89
    3          27           13.5         12.49            1.01
    4          17            8.5          9.69           -1.19
    5           7            3.5          7.92           -4.42
    6           6            3.0          6.69           -3.69
    7           2            1.0          5.80           -4.80
    8           5            2.5          5.12           -2.62
    9           4            2.0          4.58           -2.58

4. Uses of Benford’s Law

Benford’s Law is applied wherever naturally occurring numeric data may reveal patterns or anomalies. Key use cases include:

  • Accounting & Auditing: Detecting fraud or manipulation in financial statements.
  • Forensic Analytics: Spotting anomalies in tax returns, election results, or expense reports.
  • Data Validation: Checking integrity of large datasets like census data or survey results.
  • Economics & Finance: Identifying irregularities in stock prices, market data, and transactions.
  • Scientific Research: Detecting fabricated or biased experimental data.

ALTMAN Z-SCORE ANALYSIS

1. Overview

The Altman Z-Score is a financial metric used to predict the likelihood of a company entering bankruptcy within the next two years. Developed by Edward I. Altman in 1968, it combines five financial ratios in a weighted linear combination.


2. Mathematical Foundation

The Z-Score is essentially a discriminant function from multivariate statistics, which separates financially distressed firms from healthy firms based on financial ratios.

Core Concept

\[ Z = 1.2X_1 + 1.4X_2 + 3.3X_3 + 0.6X_4 + 1.0X_5 \]

\[ \begin{align*} X_1 &= \text{Working Capital / Total Assets} \\ X_2 &= \text{Retained Earnings / Total Assets} \\ X_3 &= \text{EBIT / Total Assets} \\ X_4 &= \text{Market Value of Equity / Total Liabilities} \\ X_5 &= \text{Sales / Total Assets} \end{align*} \]

  • Linear Combination of Ratios: Each ratio is multiplied by a weight reflecting its relative importance in predicting bankruptcy.

  • Threshold-based Classification: The resulting Z-Score is compared against predefined thresholds to categorize company risk.

  • Statistical Background: Altman used multiple discriminant analysis (MDA) to empirically determine the weights of financial ratios using historical company data.


3. Project: Altman Z-Score analysis of an entity over its observable history

Years <- 2015:2024

data <- data.frame(
  Year = Years,
  Working_Capital = c(366.238602998865,323.808647420777,343.087078290206,321.563793374437,259.064022551688,286.40312279824,211.193492518817,146.543957821955,120.787468956851,99.1268239420409),
  Total_Assets = c(1188.90726976097,1170.68971353583,1107.21158352681,1138.64022530615,907.384105352685,1043.33879132755,1127.53786125686,964.922380750068,995.454302290455,969.487735605799),
  Retained_Earnings = c(231.951533932914,296.766739226545,321.318004328533,250.275203271019,182.746136600687,194.249059878592,171.898617828878,146.832374171086,115.195439756733,166.633505768519),
  EBIT = c(104.271432117896,172.407375222222,257.541292546592,100.205467459213,157.800385275683,113.592141257633,79.931630147092,108.317993085784,114.87317938569,63.7215852151974),
  Market_Value_Equity = c(986.25972635745,683.829161610882,884.281041977552,674.475781725616,823.487712715455,742.985139989434,703.76548497861,478.018207698149,451.725597007781,335.168616938647),
  Total_Liabilities = c(801.790063455701,751.688452623785,784.072960540652,500.249909330159,690.126629639417,588.047554064542,651.926615089178,745.108401309699,640.719163697213,544.454169739038),
  Sales = c(1069.07219862906,1373.52237644098,1264.8627313343,1296.5789629515,738.156919997101,1038.1229145616,1137.74948569075,736.237093154764,599.024686413332,385.314262003617)
)

# ---- Compute Z-Score ----
# (uses dplyr for %>%, mutate(), and case_when())
library(dplyr)

data <- data %>%
  mutate(
    Z_Score = 1.2 * (Working_Capital / Total_Assets) +
      1.4 * (Retained_Earnings / Total_Assets) +
      3.3 * (EBIT / Total_Assets) +
      0.6 * (Market_Value_Equity / Total_Liabilities) +
      1.0 * (Sales / Total_Assets),
    Zone = case_when(
      Z_Score < 1.8 ~ "Distress",
      Z_Score < 3.0 ~ "Grey Zone",
      TRUE ~ "Safe"
    )
  )


4. Uses of Altman Z-Score

Altman Z-Score is widely applied in:

  • Credit risk assessment: Banks and lenders evaluate the probability of default.
  • Investment analysis: Investors screen companies for potential bankruptcy risk.
  • M&A due diligence: Assess financial health before acquisitions.
  • Portfolio management: Avoid distressed firms or weight them cautiously in portfolios.